ABSTRACT


Fraudulent behaviors in Google Play, the most popular Android app market, 
fuel search rank abuse and malware proliferation. To identify malware, 
previous work has focused on app executable and permission analysis. In this 
paper, we introduce FairPlay, a novel system that discovers and leverages 
traces left behind by fraudsters, to detect both malware and apps subjected to 
search rank fraud. FairPlay discovers hundreds of fraudulent apps that 
currently evade Google Bouncer's detection technology. 

KEYWORDS: FairPlay, Google Bouncer's detection technology, search rank fraud, 
fraudsters 


1. Introduction to Data Mining: 

Generally, data mining is the process of analyzing data from 
different perspectives and summarizing it into useful 
information that can be used to increase revenue, cut costs, 
or both. Data mining software is one of a 
number of analytical tools for analyzing data. It allows users 
to analyze data from many different dimensions or angles, 
categorize it, and summarize the relationships identified. 
Technically, data mining is the process of finding 
correlations or patterns among dozens of fields in large 
relational databases. Data mining is an interdisciplinary 
subfield of computer science and statistics with the overall 
goal of extracting information from a data set and transforming 
it into a comprehensible structure for further use. 
Data mining is the analysis step of the "knowledge discovery 
in databases" (KDD) process. Aside from the raw analysis 
step, it also involves database and data management 
aspects, data pre-processing, model and inference 
considerations, and interestingness metrics. 





2. Existing System: 

1. Google Play uses the Bouncer system to remove malware. However, out of the 7,756 Google Play apps we 
analyzed using VirusTotal, 12% (948) were flagged by at least one anti-virus tool and 2% (150) were identified 
as malware by at least 10 tools. 

2. Sarma et al. use risk signals extracted from app permissions, e.g., rare critical permissions (RCP) and rare pairs 
of critical permissions (RPCP), to train an SVM and inform users of the risk vs. benefit tradeoffs of apps. 

3. Peng et al. propose a score to measure the risk of apps, based on probabilistic generative models such as Naive 
Bayes. 

4. Yerima et al. also use features extracted from app permissions, API calls and commands extracted from the app 
executables. 
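
The permission-based risk signals mentioned in item 2 can be illustrated with a minimal sketch. It assumes a hypothetical corpus that maps app names to requested permissions, a hand-picked set of "critical" permissions, and an arbitrary rarity threshold. None of these values come from Sarma et al., and their actual feature definitions may differ.

```python
from collections import Counter
from itertools import combinations

# Hypothetical set of permissions treated as "critical" (an assumption of this sketch).
CRITICAL = {"SEND_SMS", "READ_CONTACTS", "RECORD_AUDIO", "ACCESS_FINE_LOCATION"}


def rare_permission_signals(apps, threshold=0.25):
    """Return, per app, its rare critical permissions (RCP) and rare pairs (RPCP).

    apps: dict mapping app name -> set of requested permissions.
    A permission (or pair) counts as "rare" when fewer than `threshold`
    of all apps in the corpus request it.
    """
    n = len(apps)
    single = Counter()
    pair = Counter()
    for perms in apps.values():
        crit = sorted(perms & CRITICAL)
        single.update(crit)
        pair.update(combinations(crit, 2))

    signals = {}
    for name, perms in apps.items():
        crit = sorted(perms & CRITICAL)
        rcp = [p for p in crit if single[p] / n < threshold]
        rpcp = [pq for pq in combinations(crit, 2) if pair[pq] / n < threshold]
        signals[name] = {"rcp": rcp, "rpcp": rpcp}
    return signals


# Made-up corpus for illustration only.
apps = {
    "flashlight": {"CAMERA", "SEND_SMS", "READ_CONTACTS"},
    "maps": {"ACCESS_FINE_LOCATION", "INTERNET"},
    "weather": {"ACCESS_FINE_LOCATION", "INTERNET"},
    "notes": {"INTERNET"},
    "run_tracker": {"ACCESS_FINE_LOCATION"},
}
signals = rare_permission_signals(apps)
print(signals["flashlight"])
```

In this toy corpus only the flashlight app requests SMS and contact access, so both permissions and their pair are reported as rare. Such per-app signals could then serve as input features for a classifier.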


2.1. Disadvantages Of Existing System: 

1. Previous work has focused on app executable and permission analysis only. 

2. It is not efficient. 

3. It achieves a lower detection rate. 

4. It takes more time. 

Proposed System: 

We propose FairPlay, a system that leverages traces left behind by fraudsters to efficiently detect Google Play 
fraud and malware. Our major contributions are: 

To detect fraud and malware, we propose and generate relational, behavioral and linguistic features that we 
use to train supervised learning algorithms. 

We formulate the notion of co-review graphs to model reviewing relations between users. 

We develop PCF, an efficient algorithm to identify temporally constrained co-review pseudo-cliques, formed 
by reviewers with substantially overlapping co-reviewing activities across short time windows. 

We use temporal dimensions of review post times to identify suspicious review spikes received by apps; we 
show that, to compensate for a negative review of an app that has rating R, a fraudster needs to post a minimum 
number of positive reviews that depends on R. We also identify apps with "unbalanced" review, rating and 
install counts, as well as apps with permission request ramps. 

We use linguistic and behavioral information to (i) detect genuine reviews, from which we then (ii) extract 
user-identified fraud and malware indicators. 
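
Two of the contributions above lend themselves to a short illustration. The sketch below first builds a co-review graph restricted to a time window and checks whether a group of reviewers is dense enough to count as a pseudo-clique, then computes how many positive reviews offset one negative review. It does not reproduce the PCF algorithm. The window length, the density threshold, the sample data and all function names are assumptions made for illustration. The compensation bound is derived here under the assumption of a 1 to 5 star scale, a 1-star negative review and 5-star positive reviews: restoring an average of R after n earlier reviews requires (nR + 1 + 5q)/(n + 1 + q) >= R, which simplifies to q >= (R - 1)/(5 - R).

```python
import math
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def co_review_graph(reviews, window_days):
    """Build a weighted co-review graph.

    reviews: list of (user, app, day) tuples, where day is a number.
    Two users are linked for an app if they both reviewed it within
    `window_days` of each other; the edge weight counts such apps.
    """
    by_app = defaultdict(list)
    for user, app, day in reviews:
        by_app[app].append((user, day))

    weights = defaultdict(int)
    for posts in by_app.values():
        linked = set()  # each user pair counts at most once per app
        for (u, du), (v, dv) in combinations(posts, 2):
            if u != v and abs(du - dv) <= window_days:
                linked.add(frozenset((u, v)))
        for edge in linked:
            weights[edge] += 1
    return dict(weights)


def is_pseudo_clique(users, weights, min_density=0.8):
    """True if the fraction of linked user pairs is at least `min_density`."""
    pairs = list(combinations(sorted(users), 2))
    if not pairs:
        return False
    linked = sum(1 for u, v in pairs if frozenset((u, v)) in weights)
    return linked / len(pairs) >= min_density


def min_positive_reviews(rating):
    """Smallest number of 5-star reviews that offsets one 1-star review."""
    r = Fraction(str(rating))  # exact arithmetic avoids float rounding
    if not 1 <= r < 5:
        raise ValueError("rating must be in [1, 5)")
    return math.ceil((r - 1) / (5 - r))


# Made-up review log: (user, app, day).
reviews = [
    ("u1", "appA", 1), ("u2", "appA", 2), ("u3", "appA", 2),
    ("u1", "appB", 10), ("u2", "appB", 11), ("u3", "appB", 30),
    ("u4", "appA", 40),
]
weights = co_review_graph(reviews, window_days=3)
print(is_pseudo_clique({"u1", "u2", "u3"}, weights))
print(min_positive_reviews(4))
```

With this sample log, users u1, u2 and u3 are all pairwise linked, while u4 reviewed long after the others and is linked to nobody. For an app rated 4, three 5-star reviews are needed to offset a single 1-star review.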

Advantages Of Proposed System: 

We build this work on the observation that fraudulent and malicious behaviors leave behind telltale signs on 
app markets. 

FairPlay achieves over 97% accuracy in classifying fraudulent and benign apps, and over 95% accuracy in 
classifying malware and benign apps. 

FairPlay significantly outperforms the malware indicators of Sarma et al. Furthermore, we show that malware 
often engages in search rank fraud as well: when trained on fraudulent and benign apps, FairPlay flagged as 
fraudulent more than 75% of the gold standard malware apps. 

FairPlay discovers hundreds of fraudulent apps. 

FairPlay also enabled us to discover a novel, coercive review campaign attack type, where app users are 
harassed into writing a positive review for the app, and into installing and reviewing other apps. 


System Architecture: 



Figure 1: System Architecture 

Figure 2: System Architecture 

6. Conclusion: 

We have introduced FairPlay, a system to detect both 
fraudulent and malware Google Play apps. Our experiments on 
a newly contributed longitudinal app dataset have shown that 
a high percentage of malware is involved in search rank fraud; 
both are accurately identified by FairPlay. In addition, we 
showed FairPlay's ability to discover hundreds of apps that 
evade Google Play's detection technology, including a new type 
of coercive fraud attack. 